Tightly Integrating Relational Learning and Multiple-Instance Regression for Real-Valued Drug Activity Prediction

Authors

  • Jesse Davis
  • Soumya Ray
Abstract

We present a new machine learning approach for 3D-QSAR, the task of predicting binding affinities of molecules to target proteins based on 3D structure. Our approach predicts binding affinity by using regression on substructures discovered by relational learning. We make two contributions to the state of the art. First, we use multiple-instance (MI) regression, which represents a molecule as a set of 3D conformations, to model activity. Second, the relational learning component employs the “Score As You Use” (SAYU) method to select substructures for their ability to improve the regression model. This is the first application of SAYU to multiple-instance, real-valued prediction. We evaluate our approach on three tasks and demonstrate that (i) SAYU outperforms standard coverage measures when selecting features for regression, (ii) the MI representation improves accuracy over standard single feature-vector encodings, and (iii) combining SAYU with MI regression is more accurate for 3D-QSAR than either approach by itself.
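To make the multiple-instance representation concrete, here is a minimal sketch of how a molecule might be stored as a "bag" of conformations and scored by a regression model. The abstract does not say how instance-level predictions are combined, so the max-over-conformations rule, the linear model, and all names (`predict_instance`, `predict_bag`, the toy feature vectors) are assumptions made for illustration, not the authors' method.

```python
from typing import List, Sequence

# Hypothetical encoding: each 3D conformation is a fixed-length feature vector,
# and a molecule (a "bag") is the list of its conformations.
Conformation = Sequence[float]
Bag = List[Conformation]


def predict_instance(weights: Sequence[float], bias: float, x: Conformation) -> float:
    """Linear score for a single conformation."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def predict_bag(weights: Sequence[float], bias: float, bag: Bag) -> float:
    """Bag-level prediction.

    Assumption: the molecule's activity is taken to be the score of its
    best-scoring conformation. Other aggregation rules are possible.
    """
    if not bag:
        raise ValueError("a bag must contain at least one conformation")
    return max(predict_instance(weights, bias, x) for x in bag)


# Toy molecule with three conformations described by two made-up features.
molecule: Bag = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = [2.0, -1.0]
bias = 0.5

predicted_activity = predict_bag(weights, bias, molecule)
```

A single feature-vector encoding would instead collapse the molecule to one row before regression; the bag form keeps every conformation available to the model.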

Similar resources

Multiple-Instance Learning of Real-Valued Data

The multiple-instance learning model has received much attention recently with a primary application area being that of drug activity prediction. Most prior work on multiple-instance learning has been for concept learning, yet for drug activity prediction, the label is a real-valued affinity measurement giving the binding strength. We present extensions of k-nearest neighbors (k-NN), Citation-k...

Multiple Fuzzy Regression Model for Fuzzy Input-Output Data

A novel approach to the problem of regression modeling for fuzzy input-output data is introduced. In order to estimate the parameters of the model, a distance on the space of interval-valued quantities is employed. By minimizing the sum of squared errors, a class of regression models is derived based on the interval-valued data obtained from the $\alpha$-level sets of fuzzy input-output data. Then,...

Real-Valued Multiple-Instance Learning with Queries

While there has been a significant amount of theoretical and empirical research on the multiple-instance learning model, most of this research is for concept learning. However, for the important application area of drug discovery, a real-valued classification is preferable. In this paper we initiate a theoretical study of real-valued multiple-instance learning. We prove that the problem of find...

Salience Assignment for Multiple-Instance Regression

We present a Multiple-Instance Learning (MIL) algorithm for determining the salience of each item in each bag with respect to the bag’s real-valued label. We use an alternating-projections constrained optimization approach to simultaneously learn a regression model and estimate all salience values. We evaluate this algorithm on a significant real-world problem, crop yield modeling, and demonstr...

Relational Instance Based Regression for Relational Reinforcement Learning

The full paper on this topic appears in the Proceedings of the Twentieth International Conference on Machine Learning. [1] Q-learning [6] is a model-free approach to tackling reinforcement learning problems which calculates a Quality or Q-function to represent the learned policy. The Q-function takes a state-action pair as input and outputs a real number which indicates the quality of that action ...

Publication date: 2007